Overview

Dataset statistics

Number of variables: 33
Number of observations: 59695
Missing cells: 236
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.0 MiB
Average record size in memory: 264.0 B
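The percentages in this table follow from the raw counts. The sketch below redoes that arithmetic, assuming "missing cells (%)" is measured against variables × observations and that the record size is total memory divided by the row count. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 33
n_observations = 59695
missing_cells = 236
total_size_mib = 15.0  # already rounded in the report

# Assumption: the missing-cell share is relative to all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per row, derived from the rounded total size.
bytes_per_record = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"approx. record size: {bytes_per_record:.1f} B")
```

Because the total size is rounded to 15.0 MiB, the recomputed record size only approximates the reported 264.0 B.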

Variable types

Numeric: 13
Categorical: 20

Warnings

Country has a high cardinality: 155 distinct values [High cardinality]
Agent has a high cardinality: 296 distinct values [High cardinality]
Company has a high cardinality: 287 distinct values [High cardinality]
ReservationStatusDate has a high cardinality: 902 distinct values [High cardinality]
IsCanceled is highly correlated with ReservationStatus [High correlation]
ReservationStatus is highly correlated with IsCanceled [High correlation]
Adults is highly skewed (γ1 = 20.93767433) [Skewed]
PreviousCancellations is highly skewed (γ1 = 23.97962684) [Skewed]
PreviousBookingsNotCanceled is highly skewed (γ1 = 23.41816044) [Skewed]
df_index has unique values [Unique]
LeadTime has 3194 (5.4%) zeros [Zeros]
StaysInWeekendNights has 25912 (43.4%) zeros [Zeros]
StaysInWeekNights has 3814 (6.4%) zeros [Zeros]
PreviousCancellations has 56480 (94.6%) zeros [Zeros]
PreviousBookingsNotCanceled has 57855 (96.9%) zeros [Zeros]
BookingChanges has 50647 (84.8%) zeros [Zeros]
DaysInWaitingList has 57831 (96.9%) zeros [Zeros]
ADR has 967 (1.6%) zeros [Zeros]
TotalOfSpecialRequests has 35108 (58.8%) zeros [Zeros]
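Warnings like these are typically threshold checks on each column's summary. The sketch below illustrates the idea only: the three thresholds are hypothetical values picked to be consistent with the flags listed here, not values read from this report's config.yaml, and `warnings_for` is not a pandas-profiling function.

```python
# Hypothetical thresholds (assumptions, not taken from config.yaml).
CARDINALITY_THRESHOLD = 50   # distinct values, categorical columns
SKEWNESS_THRESHOLD = 20.0    # absolute skewness, numeric columns
ZEROS_PCT_THRESHOLD = 1.0    # percent of zeros, numeric columns


def warnings_for(kind, distinct=None, skewness=None, zeros_pct=None):
    """Return the warning labels a column summary would receive."""
    labels = []
    if kind == "categorical":
        if distinct is not None and distinct > CARDINALITY_THRESHOLD:
            labels.append("High cardinality")
    elif kind == "numeric":
        if skewness is not None and abs(skewness) > SKEWNESS_THRESHOLD:
            labels.append("Skewed")
        if zeros_pct is not None and zeros_pct > ZEROS_PCT_THRESHOLD:
            labels.append("Zeros")
    return labels


# Summaries copied from this report.
country = warnings_for("categorical", distinct=155)
prev_canc = warnings_for("numeric", skewness=23.97962684, zeros_pct=94.6)
lead_time = warnings_for("numeric", skewness=1.357370779, zeros_pct=5.4)
adults = warnings_for("numeric", skewness=20.93767433, zeros_pct=0.4)

print(country, prev_canc, lead_time, adults)
```

Under these assumed thresholds the four columns receive the same labels as in the list above.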

Reproduction

Analysis started: 2021-01-30 01:23:57.063961
Analysis finished: 2021-01-30 01:24:50.555673
Duration: 53.49 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml
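The reported duration can be checked from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption matching how the timestamps are printed here.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-01-30 01:23:57.063961", FMT)
finished = datetime.strptime("2021-01-30 01:24:50.555673", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```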

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 59695
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59674.85957
Minimum: 2
Maximum: 119387
Zeros: 0
Zeros (%): 0.0%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5987.7
Q1: 29892.5
Median: 59526
Q3: 89425
95-th percentile: 113205.3
Maximum: 119387
Range: 119385
Interquartile range (IQR): 59532.5

Descriptive statistics

Standard deviation: 34386.1497
Coefficient of variation (CV): 0.5762250627
Kurtosis: -1.198923187
Mean: 59674.85957
Median Absolute Deviation (MAD): 29772
Skewness: 0.00104454835
Sum: 3562290742
Variance: 1182407291
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
12282 | 1 | < 0.1%
60168 | 1 | < 0.1%
19228 | 1 | < 0.1%
25369 | 1 | < 0.1%
27416 | 1 | < 0.1%
6934 | 1 | < 0.1%
2836 | 1 | < 0.1%
13075 | 1 | < 0.1%
15122 | 1 | < 0.1%
11024 | 1 | < 0.1%
Other values (59685) | 59685 | > 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
119387 | 1 | < 0.1%
119386 | 1 | < 0.1%
119383 | 1 | < 0.1%
119382 | 1 | < 0.1%
119381 | 1 | < 0.1%
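Several of the df_index statistics are derived from others: range from minimum and maximum, IQR from the quartiles, CV as standard deviation over mean, variance as the squared standard deviation, and sum as mean times row count. The sketch below recomputes them from the reported figures. The variable names are illustrative, and small differences are expected because the inputs are rounded.

```python
import math

# Figures copied from the df_index section of the report.
minimum, maximum = 2, 119387
q1, q3 = 29892.5, 89425
mean = 59674.85957
std = 34386.1497
n = 59695

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr)
print(f"CV={cv:.10f} variance={variance:.0f} sum={total:.0f}")
print(math.isclose(cv, 0.5762250627, rel_tol=1e-8))
```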

IsCanceled
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
0: 37738
1: 21957

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 59695
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1
ValueCountFrequency (%)
037738
63.2%
121957
36.8%
Histogram of lengths of the category
ValueCountFrequency (%)
037738
63.2%
121957
36.8%

Most occurring characters

ValueCountFrequency (%)
037738
63.2%
121957
36.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number59695
100.0%

Most frequent character per category

ValueCountFrequency (%)
037738
63.2%
121957
36.8%

Most occurring scripts

ValueCountFrequency (%)
Common59695
100.0%

Most frequent character per script

ValueCountFrequency (%)
037738
63.2%
121957
36.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII59695
100.0%

Most frequent character per block

ValueCountFrequency (%)
037738
63.2%
121957
36.8%
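The frequency column in the IsCanceled tables is each count divided by the total number of rows. A minimal sketch, rebuilding the column from the reported counts (row order is irrelevant for this purpose):

```python
from collections import Counter

# Rebuild the column from the counts reported for IsCanceled.
is_canceled = ["0"] * 37738 + ["1"] * 21957

counts = Counter(is_canceled)
n = sum(counts.values())

# Share of each value in percent, rounded to one decimal as in the report.
freq_pct = {value: round(100 * c / n, 1) for value, c in counts.items()}
print(n, freq_pct)
```

About 36.8% of the bookings in this sample were canceled, so the two classes are moderately imbalanced.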

LeadTime
Real number (ℝ≥0)

ZEROS

Distinct: 472
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.5098249
Minimum: 0
Maximum: 629
Zeros: 3194
Zeros (%): 5.4%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 18
Median: 69
Q3: 159
95-th percentile: 319
Maximum: 629
Range: 629
Interquartile range (IQR): 141

Descriptive statistics

Standard deviation: 106.7560362
Coefficient of variation (CV): 1.031361383
Kurtosis: 1.727359234
Mean: 103.5098249
Median Absolute Deviation (MAD): 60
Skewness: 1.357370779
Sum: 6179019
Variance: 11396.85127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 3194 | 5.4%
1 | 1766 | 3.0%
2 | 1024 | 1.7%
3 | 922 | 1.5%
4 | 838 | 1.4%
5 | 776 | 1.3%
6 | 728 | 1.2%
7 | 643 | 1.1%
8 | 583 | 1.0%
12 | 558 | 0.9%
Other values (462) | 48663 | 81.5%

Value | Count | Frequency (%)
0 | 3194 | 5.4%
1 | 1766 | 3.0%
2 | 1024 | 1.7%
3 | 922 | 1.5%
4 | 838 | 1.4%

Value | Count | Frequency (%)
629 | 10 | < 0.1%
626 | 14 | < 0.1%
622 | 9 | < 0.1%
615 | 9 | < 0.1%
608 | 6 | < 0.1%

ArrivalDateYear
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
2016: 28375
2017: 20274
2015: 11046

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 238780
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2017
3rd row: 2016
4th row: 2016
5th row: 2016
ValueCountFrequency (%)
201628375
47.5%
201720274
34.0%
201511046
 
18.5%
Histogram of lengths of the category
ValueCountFrequency (%)
201628375
47.5%
201720274
34.0%
201511046
 
18.5%

Most occurring characters

ValueCountFrequency (%)
259695
25.0%
059695
25.0%
159695
25.0%
628375
11.9%
720274
 
8.5%
511046
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number238780
100.0%

Most frequent character per category

ValueCountFrequency (%)
259695
25.0%
059695
25.0%
159695
25.0%
628375
11.9%
720274
 
8.5%
511046
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common238780
100.0%

Most frequent character per script

ValueCountFrequency (%)
259695
25.0%
059695
25.0%
159695
25.0%
628375
11.9%
720274
 
8.5%
511046
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII238780
100.0%

Most frequent character per block

ValueCountFrequency (%)
259695
25.0%
059695
25.0%
159695
25.0%
628375
11.9%
720274
 
8.5%
511046
 
4.6%
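The character tables for ArrivalDateYear can be reproduced by counting every character of every value, weighted by how many rows hold that value. A minimal sketch using the reported year counts:

```python
from collections import Counter

# Row counts per year, as reported for ArrivalDateYear.
year_counts = {"2016": 28375, "2017": 20274, "2015": 11046}

# Each character of a year string occurs once per row holding that year.
char_counts = Counter()
for year, rows in year_counts.items():
    for ch in year:
        char_counts[ch] += rows

total_chars = sum(char_counts.values())
six_pct = round(100 * char_counts["6"] / total_chars, 1)
print(char_counts.most_common(), total_chars, six_pct)
```

The digits 2, 0 and 1 appear in every value, which is why each accounts for exactly a quarter of all characters.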

ArrivalDateMonth
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
August: 6916
July: 6302
May: 5931
October: 5589
April: 5535
Other values (7): 29422

Length

Max length: 9
Median length: 6
Mean length: 5.904028813
Min length: 3

Characters and Unicode

Total characters: 352441
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: July
2nd row: April
3rd row: February
4th row: October
5th row: July
ValueCountFrequency (%)
August6916
11.6%
July6302
10.6%
May5931
9.9%
October5589
9.4%
April5535
9.3%
June5448
9.1%
September5238
8.8%
March4908
8.2%
February4099
6.9%
December3419
5.7%
Other values (2)6310
10.6%
Histogram of lengths of the category
ValueCountFrequency (%)
august6916
11.6%
july6302
10.6%
may5931
9.9%
october5589
9.4%
april5535
9.3%
june5448
9.1%
september5238
8.8%
march4908
8.2%
february4099
6.9%
december3419
5.7%
Other values (2)6310
10.6%

Most occurring characters

ValueCountFrequency (%)
e47823
13.6%
r39197
 
11.1%
u32633
 
9.3%
b21703
 
6.2%
a20842
 
5.9%
y19284
 
5.5%
t17743
 
5.0%
J14702
 
4.2%
c13916
 
3.9%
A12451
 
3.5%
Other values (16)112147
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter292746
83.1%
Uppercase Letter59695
 
16.9%

Most frequent character per category

ValueCountFrequency (%)
e47823
16.3%
r39197
13.4%
u32633
11.1%
b21703
 
7.4%
a20842
 
7.1%
y19284
 
6.6%
t17743
 
6.1%
c13916
 
4.8%
m12015
 
4.1%
l11837
 
4.0%
Other values (8)55753
19.0%
ValueCountFrequency (%)
J14702
24.6%
A12451
20.9%
M10839
18.2%
O5589
 
9.4%
S5238
 
8.8%
F4099
 
6.9%
D3419
 
5.7%
N3358
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
Latin352441
100.0%

Most frequent character per script

ValueCountFrequency (%)
e47823
13.6%
r39197
 
11.1%
u32633
 
9.3%
b21703
 
6.2%
a20842
 
5.9%
y19284
 
5.5%
t17743
 
5.0%
J14702
 
4.2%
c13916
 
3.9%
A12451
 
3.5%
Other values (16)112147
31.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII352441
100.0%

Most frequent character per block

ValueCountFrequency (%)
e47823
13.6%
r39197
 
11.1%
u32633
 
9.3%
b21703
 
6.2%
a20842
 
5.9%
y19284
 
5.5%
t17743
 
5.0%
J14702
 
4.2%
c13916
 
3.9%
A12451
 
3.5%
Other values (16)112147
31.8%
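The month table lists ten months and groups the remaining two under "Other values (2)". Assuming those two are January and November (the only month names not listed), their counts can be recovered from the uppercase-letter table, since "N" can only come from November. The sketch below does that and cross-checks the reported total characters and mean length.

```python
# Counts listed explicitly in the ArrivalDateMonth table.
listed = {
    "August": 6916, "July": 6302, "May": 5931, "October": 5589,
    "April": 5535, "June": 5448, "September": 5238, "March": 4908,
    "February": 4099, "December": 3419,
}
other_two = 6310   # "Other values (2)"
november = 3358    # count of the uppercase letter "N"
january = other_two - november  # assumption: the other month is January

counts = dict(listed, January=january, November=november)
n_rows = sum(counts.values())

# Cross-checks against figures reported elsewhere in this section.
total_chars = sum(len(month) * c for month, c in counts.items())
mean_length = total_chars / n_rows
j_initials = counts["January"] + counts["June"] + counts["July"]

print(january, n_rows, total_chars, round(mean_length, 9), j_initials)
```

The derived counts agree with the reported total of 352441 characters and with the 14702 occurrences of the uppercase letter "J".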

ArrivalDateWeekNumber
Real number (ℝ≥0)

Distinct: 53
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.13991122
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 16
Median: 28
Q3: 38
95-th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.62353557
Coefficient of variation (CV): 0.5019742128
Kurtosis: -0.9895419997
Mean: 27.13991122
Median Absolute Deviation (MAD): 11
Skewness: -0.005445386918
Sum: 1620117
Variance: 185.6007214
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
33 | 1791 | 3.0%
32 | 1517 | 2.5%
30 | 1513 | 2.5%
18 | 1483 | 2.5%
21 | 1471 | 2.5%
34 | 1462 | 2.4%
28 | 1417 | 2.4%
29 | 1415 | 2.4%
17 | 1412 | 2.4%
31 | 1408 | 2.4%
Other values (43) | 44806 | 75.1%

Value | Count | Frequency (%)
1 | 524 | 0.9%
2 | 618 | 1.0%
3 | 656 | 1.1%
4 | 733 | 1.2%
5 | 694 | 1.2%

Value | Count | Frequency (%)
53 | 933 | 1.6%
52 | 589 | 1.0%
51 | 476 | 0.8%
50 | 728 | 1.2%
49 | 930 | 1.6%

ArrivalDateDayOfMonth
Real number (ℝ≥0)

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.79656588
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
Median: 16
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.782652219
Coefficient of variation (CV): 0.5559849076
Kurtosis: -1.186781255
Mean: 15.79656588
Median Absolute Deviation (MAD): 8
Skewness: 0.0003471236631
Sum: 942976
Variance: 77.13498
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
17 | 2257 | 3.8%
5 | 2196 | 3.7%
15 | 2106 | 3.5%
9 | 2065 | 3.5%
26 | 2054 | 3.4%
25 | 2048 | 3.4%
2 | 2039 | 3.4%
12 | 2028 | 3.4%
16 | 2013 | 3.4%
18 | 2006 | 3.4%
Other values (21) | 38883 | 65.1%

Value | Count | Frequency (%)
1 | 1786 | 3.0%
2 | 2039 | 3.4%
3 | 1944 | 3.3%
4 | 1856 | 3.1%
5 | 2196 | 3.7%

Value | Count | Frequency (%)
31 | 1105 | 1.9%
30 | 1972 | 3.3%
29 | 1775 | 3.0%
28 | 1951 | 3.3%
27 | 1939 | 3.2%

StaysInWeekendNights
Real number (ℝ≥0)

ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9307144652
Minimum: 0
Maximum: 16
Zeros: 25912
Zeros (%): 43.4%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 2
Maximum: 16
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9994422929
Coefficient of variation (CV): 1.07384416
Kurtosis: 5.610020657
Mean: 0.9307144652
Median Absolute Deviation (MAD): 1
Skewness: 1.315806974
Sum: 55559
Variance: 0.9988848968
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value | Count | Frequency (%)
0 | 25912 | 43.4%
2 | 16676 | 27.9%
1 | 15352 | 25.7%
4 | 918 | 1.5%
3 | 651 | 1.1%
6 | 81 | 0.1%
5 | 46 | 0.1%
8 | 33 | 0.1%
7 | 8 | < 0.1%
9 | 7 | < 0.1%
Other values (5) | 11 | < 0.1%

Value | Count | Frequency (%)
0 | 25912 | 43.4%
1 | 15352 | 25.7%
2 | 16676 | 27.9%
3 | 651 | 1.1%
4 | 918 | 1.5%

Value | Count | Frequency (%)
16 | 1 | < 0.1%
14 | 1 | < 0.1%
13 | 1 | < 0.1%
12 | 4 | < 0.1%
10 | 4 | < 0.1%

StaysInWeekNights
Real number (ℝ≥0)

ZEROS

Distinct: 30
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.502119105
Minimum: 0
Maximum: 41
Zeros: 3814
Zeros (%): 6.4%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 3
95-th percentile: 5
Maximum: 41
Range: 41
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.909964042
Coefficient of variation (CV): 0.7633385788
Kurtosis: 19.67455136
Mean: 2.502119105
Median Absolute Deviation (MAD): 1
Skewness: 2.719199321
Sum: 149364
Variance: 3.647962642
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
2 | 16856 | 28.2%
1 | 15189 | 25.4%
3 | 11012 | 18.4%
5 | 5531 | 9.3%
4 | 4848 | 8.1%
0 | 3814 | 6.4%
6 | 750 | 1.3%
7 | 532 | 0.9%
10 | 495 | 0.8%
8 | 334 | 0.6%
Other values (20) | 334 | 0.6%

Value | Count | Frequency (%)
0 | 3814 | 6.4%
1 | 15189 | 25.4%
2 | 16856 | 28.2%
3 | 11012 | 18.4%
4 | 4848 | 8.1%

Value | Count | Frequency (%)
41 | 1 | < 0.1%
35 | 1 | < 0.1%
32 | 1 | < 0.1%
30 | 3 | < 0.1%
26 | 1 | < 0.1%

Adults
Real number (ℝ≥0)

SKEWED

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.859452215
Minimum: 0
Maximum: 55
Zeros: 217
Zeros (%): 0.4%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 55
Range: 55
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.597186927
Coefficient of variation (CV): 0.3211628253
Kurtosis: 1529.296426
Mean: 1.859452215
Median Absolute Deviation (MAD): 0
Skewness: 20.93767433
Sum: 111000
Variance: 0.3566322258
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
2 | 44911 | 75.2%
1 | 11378 | 19.1%
3 | 3151 | 5.3%
0 | 217 | 0.4%
4 | 29 | < 0.1%
26 | 3 | < 0.1%
27 | 1 | < 0.1%
55 | 1 | < 0.1%
20 | 1 | < 0.1%
40 | 1 | < 0.1%
Other values (2) | 2 | < 0.1%

Value | Count | Frequency (%)
0 | 217 | 0.4%
1 | 11378 | 19.1%
2 | 44911 | 75.2%
3 | 3151 | 5.3%
4 | 29 | < 0.1%

Value | Count | Frequency (%)
55 | 1 | < 0.1%
40 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 3 | < 0.1%
20 | 1 | < 0.1%

Children
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
0.0: 55346
1.0: 2422
2.0: 1893
3.0: 30
nan: 4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 179085
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 2.0
ValueCountFrequency (%)
0.055346
92.7%
1.02422
 
4.1%
2.01893
 
3.2%
3.030
 
0.1%
nan4
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0.055346
92.7%
1.02422
 
4.1%
2.01893
 
3.2%
3.030
 
0.1%
nan4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0115037
64.2%
.59691
33.3%
12422
 
1.4%
21893
 
1.1%
330
 
< 0.1%
n8
 
< 0.1%
a4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119382
66.7%
Other Punctuation59691
33.3%
Lowercase Letter12
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
0115037
96.4%
12422
 
2.0%
21893
 
1.6%
330
 
< 0.1%
ValueCountFrequency (%)
n8
66.7%
a4
33.3%
ValueCountFrequency (%)
.59691
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common179073
> 99.9%
Latin12
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
0115037
64.2%
.59691
33.3%
12422
 
1.4%
21893
 
1.1%
330
 
< 0.1%
ValueCountFrequency (%)
n8
66.7%
a4
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII179085
100.0%

Most frequent character per block

ValueCountFrequency (%)
0115037
64.2%
.59691
33.3%
12422
 
1.4%
21893
 
1.1%
330
 
< 0.1%
n8
 
< 0.1%
a4
 
< 0.1%
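Children is profiled as text, so its four "nan" entries are counted as a category and the section reports 0 missing values. The sketch below shows one way to recover numeric counts and the true number of missing entries from the reported table. `to_number` is a hypothetical helper, not part of the report or of pandas-profiling.

```python
import math

# Value counts as reported for Children (values are strings).
children_counts = {"0.0": 55346, "1.0": 2422, "2.0": 1893, "3.0": 30, "nan": 4}


def to_number(text):
    """Parse a cell; return None when it is the literal string 'nan'."""
    value = float(text)  # float("nan") parses without raising
    return None if math.isnan(value) else int(value)


numeric_counts = {}
missing = 0
for text, rows in children_counts.items():
    value = to_number(text)
    if value is None:
        missing += rows
    else:
        numeric_counts[value] = numeric_counts.get(value, 0) + rows

print(numeric_counts, "missing:", missing)
```

Treated this way, the column has four distinct numeric values and four missing entries rather than five categories.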

Babies
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
0: 59229
1: 458
2: 8

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 59695
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number59695
100.0%

Most frequent character per category

ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common59695
100.0%

Most frequent character per script

ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII59695
100.0%

Most frequent character per block

ValueCountFrequency (%)
059229
99.2%
1458
 
0.8%
28
 
< 0.1%

Meal
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
BB: 46118
HB: 7262
SC: 5309
Undefined: 598
FB: 408

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 537255
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FB
2nd row: HB
3rd row: BB
4th row: BB
5th row: BB
ValueCountFrequency (%)
BB 46118
77.3%
HB 7262
 
12.2%
SC 5309
 
8.9%
Undefined598
 
1.0%
FB 408
 
0.7%
Histogram of lengths of the category
ValueCountFrequency (%)
bb46118
77.3%
hb7262
 
12.2%
sc5309
 
8.9%
undefined598
 
1.0%
fb408
 
0.7%

Most occurring characters

ValueCountFrequency (%)
(space)413679
77.0%
B99906
 
18.6%
H7262
 
1.4%
S5309
 
1.0%
C5309
 
1.0%
n1196
 
0.2%
d1196
 
0.2%
e1196
 
0.2%
U598
 
0.1%
f598
 
0.1%
Other values (2)1006
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Space Separator413679
77.0%
Uppercase Letter118792
 
22.1%
Lowercase Letter4784
 
0.9%

Most frequent character per category

ValueCountFrequency (%)
B99906
84.1%
H7262
 
6.1%
S5309
 
4.5%
C5309
 
4.5%
U598
 
0.5%
F408
 
0.3%
ValueCountFrequency (%)
n1196
25.0%
d1196
25.0%
e1196
25.0%
f598
12.5%
i598
12.5%
ValueCountFrequency (%)
(space)413679
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common413679
77.0%
Latin123576
 
23.0%

Most frequent character per script

ValueCountFrequency (%)
B99906
80.8%
H7262
 
5.9%
S5309
 
4.3%
C5309
 
4.3%
n1196
 
1.0%
d1196
 
1.0%
e1196
 
1.0%
U598
 
0.5%
f598
 
0.5%
i598
 
0.5%
ValueCountFrequency (%)
(space)413679
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII537255
100.0%

Most frequent character per block

ValueCountFrequency (%)
(space)413679
77.0%
B99906
 
18.6%
H7262
 
1.4%
S5309
 
1.0%
C5309
 
1.0%
n1196
 
0.2%
d1196
 
0.2%
e1196
 
0.2%
U598
 
0.1%
f598
 
0.1%
Other values (2)1006
 
0.2%
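Every Meal value has length 9 and most of the characters are spaces, which suggests the codes are stored padded with whitespace to a fixed width. The sketch below assumes right-padding to 9 characters (an inference from the length and character tables, not something the report states) and shows that stripping the padding recovers the short codes.

```python
# Counts as reported for Meal, keyed by the stripped label.
raw_counts = {"BB": 46118, "HB": 7262, "SC": 5309, "Undefined": 598, "FB": 408}

# Assumption: values are right-padded with spaces to a width of 9.
padded = {label.ljust(9): rows for label, rows in raw_counts.items()}

# These two totals should match the report's character tables.
total_chars = sum(len(label) * rows for label, rows in padded.items())
space_chars = sum(label.count(" ") * rows for label, rows in padded.items())

# Stripping whitespace before profiling gives the real label lengths.
stripped = {label.strip(): rows for label, rows in padded.items()}
min_len = min(len(label) for label in stripped)
max_len = max(len(label) for label in stripped)

print(total_chars, space_chars, min_len, max_len)
```

Under this assumption the totals match the reported 537255 characters and 413679 spaces, so trimming the column before profiling would give more meaningful length statistics.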

Country
Categorical

HIGH CARDINALITY

Distinct: 155
Distinct (%): 0.3%
Missing: 236
Missing (%): 0.4%
Memory size: 466.5 KiB
PRT: 24171
GBR: 6125
FRA: 5217
ESP: 4275
DEU: 3600
Other values (150): 16071

Length

Max length: 3
Median length: 3
Mean length: 2.989370827
Min length: 2

Characters and Unicode

Total characters: 177745
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): < 0.1%

Sample

1st row: PRT
2nd row: PRT
3rd row: GBR
4th row: PRT
5th row: GBR
ValueCountFrequency (%)
PRT24171
40.5%
GBR6125
 
10.3%
FRA5217
 
8.7%
ESP4275
 
7.2%
DEU3600
 
6.0%
ITA1856
 
3.1%
IRL1665
 
2.8%
BEL1243
 
2.1%
BRA1135
 
1.9%
NLD1065
 
1.8%
Other values (145)9107
 
15.3%
Histogram of lengths of the category
ValueCountFrequency (%)
prt24171
40.7%
gbr6125
 
10.3%
fra5217
 
8.8%
esp4275
 
7.2%
deu3600
 
6.1%
ita1856
 
3.1%
irl1665
 
2.8%
bel1243
 
2.1%
bra1135
 
1.9%
nld1065
 
1.8%
Other values (145)9107
 
15.3%

Most occurring characters

ValueCountFrequency (%)
R40360
22.7%
P29139
16.4%
T26982
15.2%
A10838
 
6.1%
E10806
 
6.1%
B8672
 
4.9%
S6966
 
3.9%
G6651
 
3.7%
U6599
 
3.7%
F5484
 
3.1%
Other values (16)25248
14.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter177745
100.0%

Most frequent character per category

ValueCountFrequency (%)
R40360
22.7%
P29139
16.4%
T26982
15.2%
A10838
 
6.1%
E10806
 
6.1%
B8672
 
4.9%
S6966
 
3.9%
G6651
 
3.7%
U6599
 
3.7%
F5484
 
3.1%
Other values (16)25248
14.2%

Most occurring scripts

ValueCountFrequency (%)
Latin177745
100.0%

Most frequent character per script

ValueCountFrequency (%)
R40360
22.7%
P29139
16.4%
T26982
15.2%
A10838
 
6.1%
E10806
 
6.1%
B8672
 
4.9%
S6966
 
3.9%
G6651
 
3.7%
U6599
 
3.7%
F5484
 
3.1%
Other values (16)25248
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII177745
100.0%

Most frequent character per block

ValueCountFrequency (%)
R40360
22.7%
P29139
16.4%
T26982
15.2%
A10838
 
6.1%
E10806
 
6.1%
B8672
 
4.9%
S6966
 
3.9%
G6651
 
3.7%
U6599
 
3.7%
F5484
 
3.1%
Other values (16)25248
14.2%
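The two Country frequency tables give slightly different percentages for the same counts (PRT is 40.5% in one and 40.7% in the other). The figures are consistent with the first table dividing by all rows and the second dividing by non-missing rows only; that reading is an inference from the numbers, not something the report states. A minimal sketch:

```python
# Figures copied from the Country section.
n_rows = 59695
n_missing = 236
prt = 24171
total_chars = 177745

n_present = n_rows - n_missing

# Same count, two denominators.
pct_of_all_rows = round(100 * prt / n_rows, 1)
pct_of_present = round(100 * prt / n_present, 1)

# Length statistics are computed over non-missing values only.
mean_length = total_chars / n_present

# With lengths limited to 2 or 3, each 2-letter code removes one character.
two_letter_codes = 3 * n_present - total_chars

print(pct_of_all_rows, pct_of_present, round(mean_length, 9), two_letter_codes)
```

The last line implies that 632 rows hold a two-letter code instead of the usual three-letter one.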

MarketSegment
Categorical

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
Online TA: 28322
Offline TA/TO: 12084
Groups: 9826
Direct: 6323
Corporate: 2656
Other values (3): 484

Length

Max length: 13
Median length: 9
Mean length: 9.021224558
Min length: 6

Characters and Unicode

Total characters: 538522
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Groups
2nd row: Groups
3rd row: Offline TA/TO
4th row: Online TA
5th row: Online TA
ValueCountFrequency (%)
Online TA28322
47.4%
Offline TA/TO12084
20.2%
Groups9826
 
16.5%
Direct6323
 
10.6%
Corporate2656
 
4.4%
Complementary372
 
0.6%
Aviation110
 
0.2%
Undefined2
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
online28322
28.3%
ta28322
28.3%
ta/to12084
12.1%
offline12084
12.1%
groups9826
 
9.8%
direct6323
 
6.3%
corporate2656
 
2.7%
complementary372
 
0.4%
aviation110
 
0.1%
undefined2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n69214
12.9%
O52490
9.7%
T52490
9.7%
e50133
9.3%
i46951
8.7%
l40778
7.6%
A40516
7.5%
(space)40406
7.5%
f24170
 
4.5%
r21833
 
4.1%
Other values (16)99541
18.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter321357
59.7%
Uppercase Letter164675
30.6%
Space Separator40406
 
7.5%
Other Punctuation12084
 
2.2%

Most frequent character per category

ValueCountFrequency (%)
n69214
21.5%
e50133
15.6%
i46951
14.6%
l40778
12.7%
f24170
 
7.5%
r21833
 
6.8%
o15620
 
4.9%
p12854
 
4.0%
u9826
 
3.1%
s9826
 
3.1%
Other values (7)20152
 
6.3%
ValueCountFrequency (%)
O52490
31.9%
T52490
31.9%
A40516
24.6%
G9826
 
6.0%
D6323
 
3.8%
C3028
 
1.8%
U2
 
< 0.1%
ValueCountFrequency (%)
(space)40406
100.0%
ValueCountFrequency (%)
/12084
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin486032
90.3%
Common52490
 
9.7%

Most frequent character per script

ValueCountFrequency (%)
n69214
14.2%
O52490
10.8%
T52490
10.8%
e50133
10.3%
i46951
9.7%
l40778
8.4%
A40516
8.3%
f24170
 
5.0%
r21833
 
4.5%
o15620
 
3.2%
Other values (14)71837
14.8%
ValueCountFrequency (%)
(space)40406
77.0%
/12084
 
23.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII538522
100.0%

Most frequent character per block

ValueCountFrequency (%)
n69214
12.9%
O52490
9.7%
T52490
9.7%
e50133
9.3%
i46951
8.7%
l40778
7.6%
A40516
7.5%
(space)40406
7.5%
f24170
 
4.5%
r21833
 
4.1%
Other values (16)99541
18.5%
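The lowercase MarketSegment table counts words rather than whole values: "Online TA" contributes one "online" and one "ta". The sketch below reproduces those figures by lowercasing each label and splitting on whitespace. That tokenisation is inferred from the table and is an assumption about how the report was built.

```python
from collections import Counter

# Row counts per category, as reported for MarketSegment.
segment_counts = {
    "Online TA": 28322, "Offline TA/TO": 12084, "Groups": 9826,
    "Direct": 6323, "Corporate": 2656, "Complementary": 372,
    "Aviation": 110, "Undefined": 2,
}

# Assumed tokenisation: lowercase, then split on whitespace.
token_counts = Counter()
for label, rows in segment_counts.items():
    for token in label.lower().split():
        token_counts[token] += rows

n_tokens = sum(token_counts.values())
online_pct = round(100 * token_counts["online"] / n_tokens, 1)

print(token_counts.most_common(4), n_tokens, online_pct)
```

The word percentages are relative to the total number of words, which is why "online" is 28.3% in the word table while "Online TA" is 47.4% of rows.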
Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
TA/TO: 48916
Direct: 7344
Corporate: 3341
GDS: 89
Undefined: 5

Length

Max length: 9
Median length: 5
Mean length: 5.344249937
Min length: 3

Characters and Unicode

Total characters: 319025
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Direct
2nd row: TA/TO
3rd row: TA/TO
4th row: TA/TO
5th row: TA/TO
ValueCountFrequency (%)
TA/TO48916
81.9%
Direct7344
 
12.3%
Corporate3341
 
5.6%
GDS89
 
0.1%
Undefined5
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
ta/to48916
81.9%
direct7344
 
12.3%
corporate3341
 
5.6%
gds89
 
0.1%
undefined5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
T97832
30.7%
A48916
15.3%
/48916
15.3%
O48916
15.3%
r14026
 
4.4%
e10695
 
3.4%
t10685
 
3.3%
D7433
 
2.3%
i7349
 
2.3%
c7344
 
2.3%
Other values (10)16913
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter206621
64.8%
Lowercase Letter63488
 
19.9%
Other Punctuation48916
 
15.3%

Most frequent character per category

ValueCountFrequency (%)
r14026
22.1%
e10695
16.8%
t10685
16.8%
i7349
11.6%
c7344
11.6%
o6682
10.5%
p3341
 
5.3%
a3341
 
5.3%
n10
 
< 0.1%
d10
 
< 0.1%
ValueCountFrequency (%)
T97832
47.3%
A48916
23.7%
O48916
23.7%
D7433
 
3.6%
C3341
 
1.6%
G89
 
< 0.1%
S89
 
< 0.1%
U5
 
< 0.1%
ValueCountFrequency (%)
/48916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin270109
84.7%
Common48916
 
15.3%

Most frequent character per script

ValueCountFrequency (%)
T97832
36.2%
A48916
18.1%
O48916
18.1%
r14026
 
5.2%
e10695
 
4.0%
t10685
 
4.0%
D7433
 
2.8%
i7349
 
2.7%
c7344
 
2.7%
o6682
 
2.5%
Other values (9)10231
 
3.8%
ValueCountFrequency (%)
/48916
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII319025
100.0%

Most frequent character per block

ValueCountFrequency (%)
T97832
30.7%
A48916
15.3%
/48916
15.3%
O48916
15.3%
r14026
 
4.4%
e10695
 
3.4%
t10685
 
3.3%
D7433
 
2.3%
i7349
 
2.3%
c7344
 
2.3%
Other values (10)16913
 
5.3%

IsRepeatedGuest
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
0: 57737
1: 1958

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 59695
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0
ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%
Histogram of lengths of the category
ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%

Most occurring characters

ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number59695
100.0%

Most frequent character per category

ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Common59695
100.0%

Most frequent character per script

ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII59695
100.0%

Most frequent character per block

ValueCountFrequency (%)
057737
96.7%
11958
 
3.3%

PreviousCancellations
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.088064327
Minimum: 0
Maximum: 26
Zeros: 56480
Zeros (%): 94.6%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 26
Range: 26
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8557465581
Coefficient of variation (CV): 9.717289477
Kurtosis: 649.0697173
Mean: 0.088064327
Median Absolute Deviation (MAD): 0
Skewness: 23.97962684
Sum: 5257
Variance: 0.7323021716
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
0 | 56480 | 94.6%
1 | 2987 | 5.0%
2 | 65 | 0.1%
3 | 31 | 0.1%
11 | 21 | < 0.1%
24 | 20 | < 0.1%
25 | 16 | < 0.1%
5 | 14 | < 0.1%
26 | 13 | < 0.1%
19 | 12 | < 0.1%
Other values (4) | 36 | 0.1%

Value | Count | Frequency (%)
0 | 56480 | 94.6%
1 | 2987 | 5.0%
2 | 65 | 0.1%
3 | 31 | 0.1%
4 | 10 | < 0.1%

Value | Count | Frequency (%)
26 | 13 | < 0.1%
25 | 16 | < 0.1%
24 | 20 | < 0.1%
19 | 12 | < 0.1%
14 | 6 | < 0.1%

PreviousBookingsNotCanceled
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 55
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.136745121
Minimum: 0
Maximum: 70
Zeros: 57855
Zeros (%): 96.9%
Memory size: 466.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 70
Range: 70
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.487479995
Coefficient of variation (CV): 10.87775552
Kurtosis: 751.5324858
Mean: 0.136745121
Median Absolute Deviation (MAD): 0
Skewness: 23.41816044
Sum: 8163
Variance: 2.212596736
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 57855 | 96.9%
1 | 803 | 1.3%
2 | 299 | 0.5%
3 | 155 | 0.3%
4 | 115 | 0.2%
5 | 95 | 0.2%
6 | 54 | 0.1%
7 | 43 | 0.1%
8 | 35 | 0.1%
9 | 30 | 0.1%
Other values (45) | 211 | 0.4%

Value | Count | Frequency (%)
0 | 57855 | 96.9%
1 | 803 | 1.3%
2 | 299 | 0.5%
3 | 155 | 0.3%
4 | 115 | 0.2%

Value | Count | Frequency (%)
70 | 1 | < 0.1%
68 | 1 | < 0.1%
64 | 1 | < 0.1%
63 | 1 | < 0.1%
61 | 1 | < 0.1%

ReservedRoomType
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 466.5 KiB
A: 42831
D: 9679
E: 3257
F: 1479
G: 1053
Other values (5): 1396

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 955120
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: B
ValueCountFrequency (%)
A 42831
71.7%
D 9679
 
16.2%
E 3257
 
5.5%
F 1479
 
2.5%
G 1053
 
1.8%
B 608
 
1.0%
C 473
 
0.8%
H 307
 
0.5%
L 5
 
< 0.1%
P 3
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
a42831
71.7%
d9679
 
16.2%
e3257
 
5.5%
f1479
 
2.5%
g1053
 
1.8%
b608
 
1.0%
c473
 
0.8%
h307
 
0.5%
l5
 
< 0.1%
p3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
895425
93.8%
A42831
 
4.5%
D9679
 
1.0%
E3257
 
0.3%
F1479
 
0.2%
G1053
 
0.1%
B608
 
0.1%
C473
 
< 0.1%
H307
 
< 0.1%
L5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator895425
93.8%
Uppercase Letter59695
 
6.2%

Most frequent character per category

ValueCountFrequency (%)
A42831
71.7%
D9679
 
16.2%
E3257
 
5.5%
F1479
 
2.5%
G1053
 
1.8%
B608
 
1.0%
C473
 
0.8%
H307
 
0.5%
L5
 
< 0.1%
P3
 
< 0.1%
ValueCountFrequency (%)
895425
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common895425
93.8%
Latin59695
 
6.2%

Most frequent character per script

ValueCountFrequency (%)
A42831
71.7%
D9679
 
16.2%
E3257
 
5.5%
F1479
 
2.5%
G1053
 
1.8%
B608
 
1.0%
C473
 
0.8%
H307
 
0.5%
L5
 
< 0.1%
P3
 
< 0.1%
ValueCountFrequency (%)
895425
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII955120
100.0%

Most frequent character per block

ValueCountFrequency (%)
895425
93.8%
A42831
 
4.5%
D9679
 
1.0%
E3257
 
0.3%
F1479
 
0.2%
G1053
 
0.1%
B608
 
0.1%
C473
 
< 0.1%
H307
 
< 0.1%
L5
 
< 0.1%

AssignedRoomType
Categorical

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
A
36878 
D
12736 
E
3880 
F
 
1948
G
 
1280
Other values (7)
 
2973

Length

Max length16
Median length16
Mean length16
Min length16

Characters and Unicode

Total characters955120
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowA
2nd rowA
3rd rowD
4th rowA
5th rowB
ValueCountFrequency (%)
A 36878
61.8%
D 12736
 
21.3%
E 3880
 
6.5%
F 1948
 
3.3%
G 1280
 
2.1%
C 1207
 
2.0%
B 1081
 
1.8%
H 371
 
0.6%
I 167
 
0.3%
K 143
 
0.2%
Other values (2)4
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
a36878
61.8%
d12736
 
21.3%
e3880
 
6.5%
f1948
 
3.3%
g1280
 
2.1%
c1207
 
2.0%
b1081
 
1.8%
h371
 
0.6%
i167
 
0.3%
k143
 
0.2%
Other values (2)4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
895425
93.8%
A36878
 
3.9%
D12736
 
1.3%
E3880
 
0.4%
F1948
 
0.2%
G1280
 
0.1%
C1207
 
0.1%
B1081
 
0.1%
H371
 
< 0.1%
I167
 
< 0.1%
Other values (3)147
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator895425
93.8%
Uppercase Letter59695
 
6.2%

Most frequent character per category

ValueCountFrequency (%)
A36878
61.8%
D12736
 
21.3%
E3880
 
6.5%
F1948
 
3.3%
G1280
 
2.1%
C1207
 
2.0%
B1081
 
1.8%
H371
 
0.6%
I167
 
0.3%
K143
 
0.2%
Other values (2)4
 
< 0.1%
ValueCountFrequency (%)
895425
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common895425
93.8%
Latin59695
 
6.2%

Most frequent character per script

ValueCountFrequency (%)
A36878
61.8%
D12736
 
21.3%
E3880
 
6.5%
F1948
 
3.3%
G1280
 
2.1%
C1207
 
2.0%
B1081
 
1.8%
H371
 
0.6%
I167
 
0.3%
K143
 
0.2%
Other values (2)4
 
< 0.1%
ValueCountFrequency (%)
895425
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII955120
100.0%

Most frequent character per block

ValueCountFrequency (%)
895425
93.8%
A36878
 
3.9%
D12736
 
1.3%
E3880
 
0.4%
F1948
 
0.2%
G1280
 
0.1%
C1207
 
0.1%
B1081
 
0.1%
H371
 
< 0.1%
I167
 
< 0.1%
Other values (3)147
 
< 0.1%

BookingChanges
Real number (ℝ≥0)

ZEROS

Distinct20
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.2225982076
Minimum0
Maximum21
Zeros50647
Zeros (%)84.8%
Memory size466.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum21
Range21
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.6667099906
Coefficient of variation (CV)2.9951274
Kurtosis95.18461672
Mean0.2225982076
Median Absolute Deviation (MAD)0
Skewness6.561586101
Sum13288
Variance0.4445022116
MonotonicityNot monotonic
Histogram with fixed size bins (bins=20)
Value               Count   Frequency (%)
0                   50647   84.8%
1                    6345   10.6%
2                    1924   3.2%
3                     453   0.8%
4                     185   0.3%
5                      52   0.1%
6                      41   0.1%
7                      15   < 0.1%
8                       9   < 0.1%
10                      5   < 0.1%
Other values (10)      19   < 0.1%

Value               Count   Frequency (%)
0                   50647   84.8%
1                    6345   10.6%
2                    1924   3.2%
3                     453   0.8%
4                     185   0.3%

Value               Count   Frequency (%)
21                      1   < 0.1%
20                      1   < 0.1%
18                      1   < 0.1%
16                      1   < 0.1%
15                      2   < 0.1%

DepositType
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
No Deposit
52453 
Non Refund
7157 
Refundable
 
85

Length

Max length15
Median length15
Mean length15
Min length15

Characters and Unicode

Total characters895425
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo Deposit
2nd rowNo Deposit
3rd rowNo Deposit
4th rowNo Deposit
5th rowNo Deposit
ValueCountFrequency (%)
No Deposit 52453
87.9%
Non Refund 7157
 
12.0%
Refundable 85
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no52453
44.0%
deposit52453
44.0%
refund7157
 
6.0%
non7157
 
6.0%
refundable85
 
0.1%

Most occurring characters

ValueCountFrequency (%)
358085
40.0%
o112063
 
12.5%
e59780
 
6.7%
N59610
 
6.7%
D52453
 
5.9%
p52453
 
5.9%
s52453
 
5.9%
i52453
 
5.9%
t52453
 
5.9%
n14399
 
1.6%
Other values (7)29223
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter418035
46.7%
Space Separator358085
40.0%
Uppercase Letter119305
 
13.3%

Most frequent character per category

ValueCountFrequency (%)
o112063
26.8%
e59780
14.3%
p52453
12.5%
s52453
12.5%
i52453
12.5%
t52453
12.5%
n14399
 
3.4%
f7242
 
1.7%
u7242
 
1.7%
d7242
 
1.7%
Other values (3)255
 
0.1%
ValueCountFrequency (%)
N59610
50.0%
D52453
44.0%
R7242
 
6.1%
ValueCountFrequency (%)
358085
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin537340
60.0%
Common358085
40.0%

Most frequent character per script

ValueCountFrequency (%)
o112063
20.9%
e59780
11.1%
N59610
11.1%
D52453
9.8%
p52453
9.8%
s52453
9.8%
i52453
9.8%
t52453
9.8%
n14399
 
2.7%
R7242
 
1.3%
Other values (6)21981
 
4.1%
ValueCountFrequency (%)
358085
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII895425
100.0%

Most frequent character per block

ValueCountFrequency (%)
358085
40.0%
o112063
 
12.5%
e59780
 
6.7%
N59610
 
6.7%
D52453
 
5.9%
p52453
 
5.9%
s52453
 
5.9%
i52453
 
5.9%
t52453
 
5.9%
n14399
 
1.6%
Other values (7)29223
 
3.3%

Agent
Categorical

HIGH CARDINALITY

Distinct296
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
9
16128 
NULL
8161 
240
6923 
1
3585 
14
 
1810
Other values (291)
23088 

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters656645
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique41 ?
Unique (%)0.1%

Sample

1st row 305
2nd row NULL
3rd row 6
4th row 240
5th row 9
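The report lists 0 missing values for Agent, yet the literal string NULL appears as its second most common value (8161 rows), and every value has length 11, which suggests padded strings. The sketch below shows one way such values might be normalised before analysis. The helper `normalize_agent` is hypothetical, and the right-aligned padding of the sample strings is an assumption inferred from the fixed length, not something the report states.

```python
from typing import Optional


def normalize_agent(raw: str) -> Optional[str]:
    """Strip padding and map the literal 'NULL' placeholder to None."""
    value = raw.strip()
    return None if value == "NULL" else value


# Sample values from the report, re-padded to the reported length of 11
# (the padding side is an assumption).
samples = [v.rjust(11) for v in ["305", "NULL", "6", "240", "9"]]
cleaned = [normalize_agent(s) for s in samples]

# Share of rows holding the NULL placeholder, from the reported counts.
null_share = 8161 / 59695

print(cleaned, f"{null_share:.1%}")
```

Treating NULL as missing would move about 13.7% of the Agent rows into the missing count.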
Value                Count   Frequency (%)
9                    16128   27.0%
NULL                  8161   13.7%
240                   6923   11.6%
1                     3585   6.0%
14                    1810   3.0%
7                     1766   3.0%
6                     1653   2.8%
250                   1398   2.3%
241                    844   1.4%
28                     797   1.3%
Other values (286)   16630   27.9%
Histogram of lengths of the category
ValueCountFrequency (%)
916128
27.0%
null8161
13.7%
2406923
11.6%
13585
 
6.0%
141810
 
3.0%
71766
 
3.0%
61653
 
2.8%
2501398
 
2.3%
241844
 
1.4%
28797
 
1.3%
Other values (286)16630
27.9%

Most occurring characters

ValueCountFrequency (%)
529522
80.6%
919250
 
2.9%
216337
 
2.5%
L16322
 
2.5%
413514
 
2.1%
113360
 
2.0%
010141
 
1.5%
N8161
 
1.2%
U8161
 
1.2%
35429
 
0.8%
Other values (4)16448
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Space Separator529522
80.6%
Decimal Number94479
 
14.4%
Uppercase Letter32644
 
5.0%

Most frequent character per category

ValueCountFrequency (%)
919250
20.4%
216337
17.3%
413514
14.3%
113360
14.1%
010141
10.7%
35429
 
5.7%
74316
 
4.6%
54245
 
4.5%
84040
 
4.3%
63847
 
4.1%
ValueCountFrequency (%)
L16322
50.0%
N8161
25.0%
U8161
25.0%
ValueCountFrequency (%)
529522
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common624001
95.0%
Latin32644
 
5.0%

Most frequent character per script

ValueCountFrequency (%)
529522
84.9%
919250
 
3.1%
216337
 
2.6%
413514
 
2.2%
113360
 
2.1%
010141
 
1.6%
35429
 
0.9%
74316
 
0.7%
54245
 
0.7%
84040
 
0.6%
ValueCountFrequency (%)
L16322
50.0%
N8161
25.0%
U8161
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII656645
100.0%

Most frequent character per block

ValueCountFrequency (%)
529522
80.6%
919250
 
2.9%
216337
 
2.5%
L16322
 
2.5%
413514
 
2.1%
113360
 
2.0%
010141
 
1.5%
N8161
 
1.2%
U8161
 
1.2%
35429
 
0.8%
Other values (4)16448
 
2.5%

Company
Categorical

HIGH CARDINALITY

Distinct287
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
NULL
56317 
40
 
459
223
 
384
45
 
129
67
 
120
Other values (282)
 
2286

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters656645
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique80 ?
Unique (%)0.1%

Sample

1st row NULL
2nd row NULL
3rd row NULL
4th row NULL
5th row NULL
Value                Count   Frequency (%)
NULL                 56317   94.3%
40                     459   0.8%
223                    384   0.6%
45                     129   0.2%
67                     120   0.2%
153                    107   0.2%
174                     76   0.1%
281                     73   0.1%
219                     70   0.1%
154                     65   0.1%
Other values (277)    1895   3.2%
Histogram of lengths of the category
ValueCountFrequency (%)
null56317
94.3%
40459
 
0.8%
223384
 
0.6%
45129
 
0.2%
67120
 
0.2%
153107
 
0.2%
17476
 
0.1%
28173
 
0.1%
21970
 
0.1%
15465
 
0.1%
Other values (277)1895
 
3.2%

Most occurring characters

ValueCountFrequency (%)
422435
64.3%
L112634
 
17.2%
N56317
 
8.6%
U56317
 
8.6%
21658
 
0.3%
41380
 
0.2%
31306
 
0.2%
11090
 
0.2%
0875
 
0.1%
5683
 
0.1%
Other values (4)1950
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Space Separator422435
64.3%
Uppercase Letter225268
34.3%
Decimal Number8942
 
1.4%

Most frequent character per category

ValueCountFrequency (%)
21658
18.5%
41380
15.4%
31306
14.6%
11090
12.2%
0875
9.8%
5683
7.6%
7553
 
6.2%
8518
 
5.8%
9468
 
5.2%
6411
 
4.6%
ValueCountFrequency (%)
L112634
50.0%
N56317
25.0%
U56317
25.0%
ValueCountFrequency (%)
422435
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common431377
65.7%
Latin225268
34.3%

Most frequent character per script

ValueCountFrequency (%)
422435
97.9%
21658
 
0.4%
41380
 
0.3%
31306
 
0.3%
11090
 
0.3%
0875
 
0.2%
5683
 
0.2%
7553
 
0.1%
8518
 
0.1%
9468
 
0.1%
ValueCountFrequency (%)
L112634
50.0%
N56317
25.0%
U56317
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII656645
100.0%

Most frequent character per block

ValueCountFrequency (%)
422435
64.3%
L112634
 
17.2%
N56317
 
8.6%
U56317
 
8.6%
21658
 
0.3%
41380
 
0.2%
31306
 
0.2%
11090
 
0.2%
0875
 
0.1%
5683
 
0.1%
Other values (4)1950
 
0.3%

DaysInWaitingList
Real number (ℝ≥0)

ZEROS

Distinct112
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.315319541
Minimum0
Maximum391
Zeros57831
Zeros (%)96.9%
Memory size466.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum391
Range391
Interquartile range (IQR)0

Descriptive statistics

Standard deviation17.38291424
Coefficient of variation (CV)7.507781942
Kurtosis181.6806056
Mean2.315319541
Median Absolute Deviation (MAD)0
Skewness11.75749113
Sum138213
Variance302.1657074
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    57831   96.9%
39                     113   0.2%
58                      83   0.1%
44                      73   0.1%
31                      67   0.1%
35                      49   0.1%
50                      46   0.1%
62                      42   0.1%
46                      40   0.1%
38                      38   0.1%
Other values (102)    1313   2.2%

Value                Count   Frequency (%)
0                    57831   96.9%
1                        6   < 0.1%
2                        3   < 0.1%
3                       32   0.1%
4                       14   < 0.1%

Value                Count   Frequency (%)
391                     19   < 0.1%
379                      8   < 0.1%
330                      6   < 0.1%
259                      7   < 0.1%
236                     17   < 0.1%

CustomerType
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
Transient
44654 
Transient-Party
12652 
Contract
 
2064
Group
 
325

Length

Max length15
Median length9
Mean length10.21531117
Min length5

Characters and Unicode

Total characters609803
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTransient-Party
2nd rowTransient-Party
3rd rowTransient
4th rowTransient-Party
5th rowTransient-Party
ValueCountFrequency (%)
Transient44654
74.8%
Transient-Party12652
 
21.2%
Contract2064
 
3.5%
Group325
 
0.5%
Histogram of lengths of the category
ValueCountFrequency (%)
transient44654
74.8%
transient-party12652
 
21.2%
contract2064
 
3.5%
group325
 
0.5%

Most occurring characters

ValueCountFrequency (%)
n116676
19.1%
t74086
12.1%
r72347
11.9%
a72022
11.8%
T57306
9.4%
s57306
9.4%
i57306
9.4%
e57306
9.4%
-12652
 
2.1%
P12652
 
2.1%
Other values (7)20144
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter524804
86.1%
Uppercase Letter72347
 
11.9%
Dash Punctuation12652
 
2.1%

Most frequent character per category

ValueCountFrequency (%)
n116676
22.2%
t74086
14.1%
r72347
13.8%
a72022
13.7%
s57306
10.9%
i57306
10.9%
e57306
10.9%
y12652
 
2.4%
o2389
 
0.5%
c2064
 
0.4%
Other values (2)650
 
0.1%
ValueCountFrequency (%)
T57306
79.2%
P12652
 
17.5%
C2064
 
2.9%
G325
 
0.4%
ValueCountFrequency (%)
-12652
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin597151
97.9%
Common12652
 
2.1%

Most frequent character per script

ValueCountFrequency (%)
n116676
19.5%
t74086
12.4%
r72347
12.1%
a72022
12.1%
T57306
9.6%
s57306
9.6%
i57306
9.6%
e57306
9.6%
P12652
 
2.1%
y12652
 
2.1%
Other values (6)7492
 
1.3%
ValueCountFrequency (%)
-12652
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII609803
100.0%

Most frequent character per block

ValueCountFrequency (%)
n116676
19.1%
t74086
12.1%
r72347
11.9%
a72022
11.8%
T57306
9.4%
s57306
9.4%
i57306
9.4%
e57306
9.4%
-12652
 
2.1%
P12652
 
2.1%
Other values (7)20144
 
3.3%

ADR
Real number (ℝ)

ZEROS

Distinct6269
Distinct (%)10.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean101.8662779
Minimum-6.38
Maximum508
Zeros967
Zeros (%)1.6%
Memory size466.5 KiB

Quantile statistics

Minimum-6.38
5-th percentile38.098
Q169.52
median94.5
Q3126
95-th percentile194
Maximum508
Range514.38
Interquartile range (IQR)56.48
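The range and interquartile range above follow directly from the other quantile figures. A minimal sketch of the arithmetic, using illustrative variable names and the values reported for ADR:

```python
import math

# Quantile figures copied from the ADR summary above.
minimum, q1, q3, maximum = -6.38, 69.52, 126.0, 508.0

iqr = q3 - q1                    # reported IQR: 56.48
value_range = maximum - minimum  # reported range: 514.38

# Compare with a tolerance, since binary floats cannot represent
# every two-decimal value exactly.
print(math.isclose(iqr, 56.48), math.isclose(value_range, 514.38))
```

The negative minimum (-6.38) widens the range beyond the maximum of 508, which is why this variable is typed as a real number over ℝ rather than ℝ≥0.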

Descriptive statistics

Standard deviation48.13930729
Coefficient of variation (CV)0.4725735373
Kurtosis2.044484037
Mean101.8662779
Median Absolute Deviation (MAD)27.9
Skewness1.00229947
Sum6080907.46
Variance2317.392906
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
62                     1871   3.1%
75                     1376   2.3%
90                     1278   2.1%
65                     1209   2.0%
0                       967   1.6%
80                      944   1.6%
95                      803   1.3%
120                     778   1.3%
100                     769   1.3%
85                      747   1.3%
Other values (6259)   48953   82.0%

Value                 Count   Frequency (%)
-6.38                     1   < 0.1%
0                       967   1.6%
0.26                      1   < 0.1%
1                         6   < 0.1%
1.8                       1   < 0.1%

Value                 Count   Frequency (%)
508                       1   < 0.1%
451.5                     1   < 0.1%
450                       1   < 0.1%
397.38                    1   < 0.1%
388                       2   < 0.1%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
0
56022 
1
 
3659
2
 
13
3
 
1

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters59695
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number59695
100.0%

Most frequent character per category

ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common59695
100.0%

Most frequent character per script

ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII59695
100.0%

Most frequent character per block

ValueCountFrequency (%)
056022
93.8%
13659
 
6.1%
213
 
< 0.1%
31
 
< 0.1%

TotalOfSpecialRequests
Real number (ℝ≥0)

ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.5722422313
Minimum0
Maximum5
Zeros35108
Zeros (%)58.8%
Memory size466.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile2
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.7930812548
Coefficient of variation (CV)1.38591878
Kurtosis1.535222106
Mean0.5722422313
Median Absolute Deviation (MAD)0
Skewness1.352938301
Sum34160
Variance0.6289778768
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value   Count   Frequency (%)
0       35108   58.8%
1       16665   27.9%
2        6490   10.9%
3        1237   2.1%
4         171   0.3%
5          24   < 0.1%

Value   Count   Frequency (%)
0       35108   58.8%
1       16665   27.9%
2        6490   10.9%
3        1237   2.1%
4         171   0.3%

Value   Count   Frequency (%)
5          24   < 0.1%
4         171   0.3%
3        1237   2.1%
2        6490   10.9%
1       16665   27.9%

ReservationStatus
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
Check-Out
37738 
Canceled
21363 
No-Show
 
594

Length

Max length9
Median length9
Mean length8.622229667
Min length7
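The mean length above is a count-weighted average of the label lengths. The sketch below recomputes it from the three ReservationStatus counts shown earlier in this entry; the variable names are illustrative only.

```python
# Value counts copied from the ReservationStatus summary above.
counts = {"Check-Out": 37738, "Canceled": 21363, "No-Show": 594}

n_rows = sum(counts.values())
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_rows  # reported mean length: 8.622229667

print(n_rows, total_characters, round(mean_length, 9))
```

The same total reappears as "Total characters" in the Characters and Unicode block for this variable.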

Characters and Unicode

Total characters514704
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCheck-Out
2nd rowCanceled
3rd rowCheck-Out
4th rowCheck-Out
5th rowCanceled
ValueCountFrequency (%)
Check-Out37738
63.2%
Canceled21363
35.8%
No-Show594
 
1.0%
Histogram of lengths of the category
ValueCountFrequency (%)
check-out37738
63.2%
canceled21363
35.8%
no-show594
 
1.0%

Most occurring characters

ValueCountFrequency (%)
e80464
15.6%
C59101
11.5%
c59101
11.5%
h38332
7.4%
-38332
7.4%
k37738
7.3%
O37738
7.3%
u37738
7.3%
t37738
7.3%
a21363
 
4.2%
Other values (7)67059
13.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter378345
73.5%
Uppercase Letter98027
 
19.0%
Dash Punctuation38332
 
7.4%

Most frequent character per category

ValueCountFrequency (%)
e80464
21.3%
c59101
15.6%
h38332
10.1%
k37738
10.0%
u37738
10.0%
t37738
10.0%
a21363
 
5.6%
n21363
 
5.6%
l21363
 
5.6%
d21363
 
5.6%
Other values (2)1782
 
0.5%
ValueCountFrequency (%)
C59101
60.3%
O37738
38.5%
N594
 
0.6%
S594
 
0.6%
ValueCountFrequency (%)
-38332
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin476372
92.6%
Common38332
 
7.4%

Most frequent character per script

ValueCountFrequency (%)
e80464
16.9%
C59101
12.4%
c59101
12.4%
h38332
8.0%
k37738
7.9%
O37738
7.9%
u37738
7.9%
t37738
7.9%
a21363
 
4.5%
n21363
 
4.5%
Other values (6)45696
9.6%
ValueCountFrequency (%)
-38332
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII514704
100.0%

Most frequent character per block

ValueCountFrequency (%)
e80464
15.6%
C59101
11.5%
c59101
11.5%
h38332
7.4%
-38332
7.4%
k37738
7.3%
O37738
7.3%
u37738
7.3%
t37738
7.3%
a21363
 
4.2%
Other values (7)67059
13.0%

ReservationStatusDate
Categorical

HIGH CARDINALITY

Distinct902
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
2015-10-21
 
730
2015-07-06
 
392
2016-11-25
 
387
2015-01-01
 
384
2016-01-18
 
318
Other values (897)
57484 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters596950
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique27 ?
Unique (%)< 0.1%

Sample

1st row2015-07-17
2nd row2017-01-06
3rd row2016-02-16
4th row2016-10-11
5th row2016-02-20
ValueCountFrequency (%)
2015-10-21730
 
1.2%
2015-07-06392
 
0.7%
2016-11-25387
 
0.6%
2015-01-01384
 
0.6%
2016-01-18318
 
0.5%
2015-07-02231
 
0.4%
2016-12-07221
 
0.4%
2015-12-18214
 
0.4%
2016-02-09190
 
0.3%
2016-04-04189
 
0.3%
Other values (892)56439
94.5%
Histogram of lengths of the category
ValueCountFrequency (%)
2015-10-21730
 
1.2%
2015-07-06392
 
0.7%
2016-11-25387
 
0.6%
2015-01-01384
 
0.6%
2016-01-18318
 
0.5%
2015-07-02231
 
0.4%
2016-12-07221
 
0.4%
2015-12-18214
 
0.4%
2016-02-09190
 
0.3%
2016-04-04189
 
0.3%
Other values (892)56439
94.5%

Most occurring characters

ValueCountFrequency (%)
0135108
22.6%
-119390
20.0%
1109340
18.3%
293814
15.7%
639657
 
6.6%
729956
 
5.0%
523436
 
3.9%
313436
 
2.3%
811483
 
1.9%
910783
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number477560
80.0%
Dash Punctuation119390
 
20.0%

Most frequent character per category

ValueCountFrequency (%)
0135108
28.3%
1109340
22.9%
293814
19.6%
639657
 
8.3%
729956
 
6.3%
523436
 
4.9%
313436
 
2.8%
811483
 
2.4%
910783
 
2.3%
410547
 
2.2%
ValueCountFrequency (%)
-119390
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common596950
100.0%

Most frequent character per script

ValueCountFrequency (%)
0135108
22.6%
-119390
20.0%
1109340
18.3%
293814
15.7%
639657
 
6.6%
729956
 
5.0%
523436
 
3.9%
313436
 
2.3%
811483
 
1.9%
910783
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII596950
100.0%

Most frequent character per block

ValueCountFrequency (%)
0135108
22.6%
-119390
20.0%
1109340
18.3%
293814
15.7%
639657
 
6.6%
729956
 
5.0%
523436
 
3.9%
313436
 
2.3%
811483
 
1.9%
910783
 
1.8%

Hotel
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size466.5 KiB
City Hotel
39681 
Resort Hotel
20014 

Length

Max length12
Median length10
Mean length10.67054192
Min length10

Characters and Unicode

Total characters636978
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowResort Hotel
2nd rowResort Hotel
3rd rowResort Hotel
4th rowResort Hotel
5th rowCity Hotel
ValueCountFrequency (%)
City Hotel39681
66.5%
Resort Hotel20014
33.5%
Histogram of lengths of the category
ValueCountFrequency (%)
hotel59695
50.0%
city39681
33.2%
resort20014
 
16.8%

Most occurring characters

ValueCountFrequency (%)
t119390
18.7%
e79709
12.5%
o79709
12.5%
59695
9.4%
H59695
9.4%
l59695
9.4%
C39681
 
6.2%
i39681
 
6.2%
y39681
 
6.2%
R20014
 
3.1%
Other values (2)40028
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter457893
71.9%
Uppercase Letter119390
 
18.7%
Space Separator59695
 
9.4%

Most frequent character per category

ValueCountFrequency (%)
t119390
26.1%
e79709
17.4%
o79709
17.4%
l59695
13.0%
i39681
 
8.7%
y39681
 
8.7%
s20014
 
4.4%
r20014
 
4.4%
ValueCountFrequency (%)
H59695
50.0%
C39681
33.2%
R20014
 
16.8%
ValueCountFrequency (%)
59695
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin577283
90.6%
Common59695
 
9.4%

Most frequent character per script

ValueCountFrequency (%)
t119390
20.7%
e79709
13.8%
o79709
13.8%
H59695
10.3%
l59695
10.3%
C39681
 
6.9%
i39681
 
6.9%
y39681
 
6.9%
R20014
 
3.5%
s20014
 
3.5%
ValueCountFrequency (%)
59695
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII636978
100.0%

Most frequent character per block

ValueCountFrequency (%)
t119390
18.7%
e79709
12.5%
o79709
12.5%
59695
9.4%
H59695
9.4%
l59695
9.4%
C39681
 
6.2%
i39681
 
6.2%
y39681
 
6.2%
R20014
 
3.1%
Other values (2)40028
 
6.3%

Interactions

2021-01-29T22:24:35.202116image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:35.380146image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:35.584201image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:35.780236image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:35.971287image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:36.168332image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:36.381380image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:36.574423image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:36.765455image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:36.957509image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:37.143540image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:37.358589image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:37.550641image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:37.744685image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:37.923725image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.099758image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.271803image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.444841image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.634884image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.798921image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:38.958957image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:39.125144image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:39.306035image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:39.495077image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:39.661114image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:39.834144image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:40.034188image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:40.235243image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:40.437279image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:40.636333image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:40.848380image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:41.044424image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:41.236467image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:41.435512image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:41.647550image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:41.841593image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.039638image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.243693image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.426733image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.607765image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.785805image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:42.966845image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:43.162899image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:43.339929image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:43.518969image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:43.692017image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:43.874058image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.043095image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.229137image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.404176image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.590218image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.771250image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:44.952299image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:45.134340image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:45.336385image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:45.507423image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:45.683573image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:45.862494image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:46.047544image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:46.215582image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2021-01-29T22:24:46.413627image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
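
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code the report itself runs; the function name pearson_r and the toy values are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical toy values, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
r_original = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10.0 + 5.0 * v for v in x], y)
```

Both calls return the same value, which illustrates the invariance under changes in location and scale mentioned above.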

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
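
As a rough sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names (average_ranks, spearman_rho) and the toy values are assumptions made for this example, and giving tied values the average of their ranks is a common convention rather than something the report states.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical monotonic but nonlinear relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because y increases whenever x does, the ranks agree exactly and ρ reaches its maximum even though the relationship is not linear.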

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
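
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, assuming tied pairs count as neither concordant nor discordant; the function name kendall_tau_a and the toy values are illustrative, and statistical libraries often apply a tie-corrected variant instead, so results can differ on data with many ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither total
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy values: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = 4/6.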

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
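
For illustration, the bias-corrected estimator can be sketched from a contingency table of counts as below. This is written from the published formula and is not necessarily the exact code behind this report; the function name and the toy tables are assumptions, and the sketch assumes every row and column has a nonzero total.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V (Bergsma, 2013) for an r x k table of counts."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical 2 x 2 tables: no association, then perfect association.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[20, 0], [0, 20]])
```

The first table yields 0 and the second yields 1 (up to floating-point rounding), matching the range described above.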

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

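(index) | df_index | IsCanceled | LeadTime | ArrivalDateYear | ArrivalDateMonth | ArrivalDateWeekNumber | ArrivalDateDayOfMonth | StaysInWeekendNights | StaysInWeekNights | Adults | Children | Babies | Meal | Country | MarketSegment | DistributionChannel | IsRepeatedGuest | PreviousCancellations | PreviousBookingsNotCanceled | ReservedRoomType | AssignedRoomType | BookingChanges | DepositType | Agent | Company | DaysInWaitingList | CustomerType | ADR | RequiredCarParkingSpaces | TotalOfSpecialRequests | ReservationStatus | ReservationStatusDate | Hotel
0 | 429 | 0 | 57 | 2015 | July | 29 | 15 | 0 | 2 | 2 | 0.0 | 0 | FB | PRT | Groups | Direct | 0 | 0 | 0 | A | A | 0 | No Deposit | 305 | NULL | 0 | Transient-Party | 107.00 | 0 | 0 | Check-Out | 2015-07-17 | Resort Hotel
1 | 11165 | 1 | 203 | 2017 | April | 17 | 25 | 0 | 2 | 2 | 0.0 | 0 | HB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | NULL | NULL | 0 | Transient-Party | 90.00 | 0 | 0 | Canceled | 2017-01-06 | Resort Hotel
2 | 21043 | 0 | 83 | 2016 | February | 7 | 9 | 2 | 5 | 2 | 0.0 | 0 | BB | GBR | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | D | 0 | No Deposit | 6 | NULL | 0 | Transient | 26.00 | 0 | 0 | Check-Out | 2016-02-16 | Resort Hotel
3 | 22591 | 0 | 0 | 2016 | October | 42 | 10 | 1 | 0 | 1 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 1 | 0 | 0 | A | A | 0 | No Deposit | 240 | NULL | 0 | Transient-Party | 56.00 | 0 | 1 | Check-Out | 2016-10-11 | Resort Hotel
4 | 54136 | 1 | 187 | 2016 | July | 28 | 7 | 0 | 2 | 0 | 2.0 | 0 | BB | GBR | Online TA | TA/TO | 0 | 0 | 0 | B | B | 0 | No Deposit | 9 | NULL | 0 | Transient-Party | 86.50 | 0 | 0 | Canceled | 2016-02-20 | City Hotel
5 | 61873 | 1 | 41 | 2016 | December | 53 | 27 | 2 | 5 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | NULL | NULL | 0 | Transient-Party | 69.29 | 0 | 0 | Canceled | 2016-12-12 | City Hotel
6 | 73719 | 1 | 258 | 2015 | July | 27 | 2 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 1 | 1 | 0 | A | A | 0 | No Deposit | 1 | NULL | 0 | Transient-Party | 62.80 | 0 | 0 | Canceled | 2014-10-17 | City Hotel
7 | 48420 | 1 | 31 | 2016 | March | 13 | 23 | 0 | 1 | 2 | 0.0 | 0 | BB | ESP | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 126.90 | 0 | 0 | Canceled | 2016-03-18 | City Hotel
8 | 84433 | 0 | 22 | 2016 | February | 8 | 19 | 2 | 5 | 2 | 0.0 | 0 | BB | GBR | Direct | Direct | 0 | 0 | 0 | A | A | 0 | No Deposit | NULL | NULL | 0 | Transient | 100.00 | 1 | 1 | Check-Out | 2016-02-26 | City Hotel
9 | 51422 | 1 | 322 | 2016 | May | 21 | 19 | 1 | 3 | 2 | 0.0 | 0 | BB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 31 | NULL | 120 | Transient | 80.00 | 0 | 0 | Canceled | 2016-02-10 | City Hotel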

Last rows

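(index) | df_index | IsCanceled | LeadTime | ArrivalDateYear | ArrivalDateMonth | ArrivalDateWeekNumber | ArrivalDateDayOfMonth | StaysInWeekendNights | StaysInWeekNights | Adults | Children | Babies | Meal | Country | MarketSegment | DistributionChannel | IsRepeatedGuest | PreviousCancellations | PreviousBookingsNotCanceled | ReservedRoomType | AssignedRoomType | BookingChanges | DepositType | Agent | Company | DaysInWaitingList | CustomerType | ADR | RequiredCarParkingSpaces | TotalOfSpecialRequests | ReservationStatus | ReservationStatusDate | Hotel
59685 | 78848 | 0 | 29 | 2015 | October | 42 | 14 | 0 | 4 | 2 | 2.0 | 0 | BB | FIN | Online TA | TA/TO | 0 | 0 | 0 | F | F | 0 | No Deposit | 9 | NULL | 0 | Contract | 163.58 | 0 | 1 | Check-Out | 2015-10-18 | City Hotel
59686 | 8369 | 1 | 2 | 2016 | September | 39 | 24 | 2 | 1 | 2 | 2.0 | 0 | BB | BEL | Online TA | TA/TO | 0 | 0 | 0 | G | G | 0 | No Deposit | 240 | NULL | 0 | Transient | 165.00 | 0 | 0 | Canceled | 2016-09-23 | Resort Hotel
59687 | 1182 | 0 | 96 | 2015 | August | 34 | 17 | 1 | 5 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 240 | NULL | 0 | Transient-Party | 134.00 | 0 | 3 | Check-Out | 2015-08-23 | Resort Hotel
59688 | 91307 | 0 | 45 | 2016 | June | 25 | 17 | 1 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 1 | NULL | 0 | Transient-Party | 65.00 | 0 | 1 | Check-Out | 2016-06-20 | City Hotel
59689 | 110522 | 0 | 123 | 2017 | April | 17 | 24 | 1 | 3 | 2 | 0.0 | 0 | BB | FRA | Direct | Direct | 0 | 0 | 0 | A | A | 0 | No Deposit | 14 | NULL | 0 | Transient | 105.75 | 0 | 1 | Check-Out | 2017-04-28 | City Hotel
59690 | 42302 | 0 | 41 | 2015 | September | 36 | 4 | 0 | 1 | 2 | 0.0 | 0 | HB | ITA | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 39 | NULL | 38 | Transient-Party | 110.00 | 0 | 0 | Check-Out | 2015-09-05 | City Hotel
59691 | 4829 | 0 | 171 | 2016 | April | 15 | 4 | 1 | 1 | 1 | 0.0 | 0 | HB | DEU | Groups | TA/TO | 0 | 0 | 0 | A | A | 1 | Non Refund | 298 | NULL | 0 | Transient-Party | 54.50 | 0 | 0 | Check-Out | 2016-04-06 | Resort Hotel
59692 | 54784 | 1 | 177 | 2016 | July | 31 | 26 | 0 | 4 | 2 | 0.0 | 0 | SC | ROU | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 80.75 | 0 | 1 | Canceled | 2016-03-05 | City Hotel
59693 | 110326 | 0 | 31 | 2017 | April | 16 | 21 | 2 | 3 | 2 | 0.0 | 0 | BB | DEU | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 21 | NULL | 0 | Transient-Party | 85.00 | 0 | 0 | Check-Out | 2017-04-26 | City Hotel
59694 | 30708 | 0 | 0 | 2016 | December | 49 | 1 | 0 | 1 | 2 | 0.0 | 0 | BB | PRT | Direct | Direct | 0 | 0 | 0 | A | A | 1 | No Deposit | NULL | NULL | 0 | Transient | 43.00 | 0 | 0 | Check-Out | 2016-12-02 | Resort Hotel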